Detecting and correcting the bias of unmeasured factors using perturbation analysis: a data-mining approach
نویسنده
چکیده
BACKGROUND The randomized controlled study is the gold-standard research method in biomedicine. In contrast, the validity of a (nonrandomized) observational study is often questioned because of unknown/unmeasured factors, which may have confounding and/or effect-modifying potential. METHODS In this paper, the author proposes a perturbation test to detect the bias of unmeasured factors and a perturbation adjustment to correct for such bias. The proposed method circumvents the problem of measuring unknowns by collecting the perturbations of unmeasured factors instead. Specifically, a perturbation is a variable that is readily available (or can be measured easily) and is potentially associated, though perhaps only very weakly, with unmeasured factors. The author conducted extensive computer simulations to provide a proof of concept. RESULTS Computer simulations show that, as the number of perturbation variables increases from data mining, the power of the perturbation test increased progressively, up to nearly 100%. In addition, after the perturbation adjustment, the bias decreased progressively, down to nearly 0%. CONCLUSIONS The data-mining perturbation analysis described here is recommended for use in detecting and correcting the bias of unmeasured factors in observational studies.
منابع مشابه
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to design of fault tolerant computing systems. In this paper, a technique is employed that enable the combination of several codes, in order to obtain flexibility in the design of error correcting codes. Code combining techniques are very effective, which one of these codes are turbo codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
متن کاملOutlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
متن کاملAn Integrated DEA and Data Mining Approach for Performance Assessment
This paper presents a data envelopment analysis (DEA) model combined with Bootstrapping to assess performance of one of the Data mining Algorithms. We applied a two-step process for performance productivity analysis of insurance branches within a case study. First, using a DEA model, the study analyzes the productivity of eighteen decision-making units (DMUs). Using a Malmquist index, DEA deter...
متن کاملConcept drift detection in business process logs using deep learning
Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects and etc.), traditional process mining techniques ...
متن کاملCorrecting the stress-strain curve in hot compression test using finite element analysis and Taguchi method
In the hot compression test friction has a detrimental influence on the flow stress through the process and therefore, correcting the deformation curve for real behavior is very important for both researchers and engineers. In this study, a series of compression tests were simulated using Abaqus software. In this study, it has been employed the Taguchi method to design experiments by the factor...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 14 شماره
صفحات -
تاریخ انتشار 2014